Built by aligning high-quality genomes, saved as paths through the pangenome.
Human Pangenome Reference Consortium (HPRC)
Liao, Asri, Ebler, et al. Nature 2023
A snarl is a subgraph bounded by two node sides that are:
Separable: splitting the node into its two node sides separates a subgraph from the graph
Minimal: there are no nodes within the snarl that are separable with either boundary node side
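The separability condition can be sketched in a few lines. This is a minimal illustration, not vg's implementation: it assumes a toy graph whose vertices are node sides written as `(node, "L")` or `(node, "R")`, and the function names `opposite` and `is_separable` are hypothetical. Splitting a boundary node is modelled by leaving out the link between its two sides. The sketch does not check minimality.

```python
from collections import defaultdict

def opposite(side):
    """Return the other side of the same node."""
    node, end = side
    return (node, "R" if end == "L" else "L")

def is_separable(nodes, edges, start, end):
    """Check whether the node sides `start` and `end` bound a separable subgraph.

    nodes: iterable of node ids
    edges: iterable of (side, side) pairs, each side a (node, "L"/"R") tuple
    """
    boundary = {start[0], end[0]}
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    # Link the two sides of every non-boundary node; boundary nodes stay split.
    for n in nodes:
        if n not in boundary:
            adj[(n, "L")].add((n, "R"))
            adj[(n, "R")].add((n, "L"))
    # Collect everything reachable from the start side.
    seen = {start}
    stack = [start]
    while stack:
        cur = stack.pop()
        for nxt in adj[cur]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    # Separable: both boundary sides are inside, their opposite sides are not.
    return (end in seen
            and opposite(start) not in seen
            and opposite(end) not in seen)

# A simple bubble: 1 -> {2, 3} -> 4
bubble_nodes = [1, 2, 3, 4]
bubble_edges = [
    ((1, "R"), (2, "L")),
    ((1, "R"), (3, "L")),
    ((2, "R"), (4, "L")),
    ((3, "R"), (4, "L")),
]
print(is_separable(bubble_nodes, bubble_edges, (1, "R"), (4, "L")))
print(is_separable(bubble_nodes, bubble_edges, (1, "R"), (2, "R")))
```

In this toy bubble, the right side of node 1 and the left side of node 4 bound a separable subgraph, while the right sides of nodes 1 and 2 do not, because the left side of node 2 is still reachable.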
A run of consecutive snarls and nodes is called a chain
Snarls and chains can be nested inside of each other.
The nested relationship of snarls and chains is described by the snarl tree.
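One way to picture the snarl tree is as alternating layers of chains and snarls. The sketch below is illustrative only: the `Snarl` and `Chain` classes, the `render` helper, and the node names are hypothetical and do not reflect how vg stores its decomposition.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Snarl:
    start: str                      # boundary node at one end
    end: str                        # boundary node at the other end
    chains: List["Chain"] = field(default_factory=list)  # child chains

@dataclass
class Chain:
    # A run of nodes (plain strings) and snarls, in order.
    children: List[Union[str, Snarl]] = field(default_factory=list)

def render(item, depth=0):
    """Return the tree as a list of indented lines, one per chain, snarl, or node."""
    pad = "  " * depth
    if isinstance(item, Chain):
        lines = [f"{pad}chain"]
        for child in item.children:
            lines += render(child, depth + 1)
        return lines
    if isinstance(item, Snarl):
        lines = [f"{pad}snarl {item.start}..{item.end}"]
        for chain in item.chains:
            lines += render(chain, depth + 1)
        return lines
    return [f"{pad}node {item}"]

# A snarl 3..6 nested inside one child chain of the snarl 1..8.
inner = Snarl("3", "6", [Chain(["4"]), Chain(["5"])])
outer = Snarl("1", "8", [Chain(["2"]), Chain(["3", inner, "6"])])
top = Chain(["1", outer, "8"])

lines = render(top)
print("\n".join(lines))
```

Each extra level of indentation in the printed output corresponds to one step down the tree: a chain's children are nodes and snarls, and a snarl's children are chains.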
Netgraphs are a representation of snarls with their child chains collapsed into a single node
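The collapsing step can be sketched as a relabelling of edges. This is a simplified illustration that ignores node orientation; the `net_graph` function, the chain labels "A" and "B", and the example edges are all hypothetical.

```python
def net_graph(edges, chains):
    """Collapse each child chain of a snarl into a single node.

    edges:  iterable of (u, v) node pairs inside the snarl
    chains: dict mapping a chain label to the list of nodes it contains
    Returns the set of edges of the collapsed graph.
    """
    # Map every node in a chain to that chain's label.
    label = {}
    for name, members in chains.items():
        for node in members:
            label[node] = name
    collapsed = set()
    for u, v in edges:
        cu, cv = label.get(u, u), label.get(v, v)
        if cu != cv:                # drop edges internal to a chain
            collapsed.add((cu, cv))
    return collapsed

# Snarl bounded by nodes 1 and 8 with two child chains.
# Chain "B" contains a nested snarl (3 -> {4, 5} -> 6).
edges = [(1, 2), (1, 3), (3, 4), (3, 5), (4, 6), (5, 6), (2, 8), (6, 8)]
chains = {"A": [2], "B": [3, 4, 5, 6]}

net = net_graph(edges, chains)
print(sorted(net, key=str))
```

The nested snarl inside chain "B" disappears from the result, leaving only the connections between the boundary nodes and the two collapsed chains.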
vg deconstruct
VCF + graph + decomposition for this graph
Trick for getting this snarl decomposition to look better (currently only for the distance index):
vg index -j [graph.dist] -w 6
vg giraffe
Short reads
Long reads
On the HPRC v2 graph which is x size?
vg graph formats and indexes
Indexes
.gbwt (Graph Burrows-Wheeler Transform): haplotype paths
.gg (GBWT Graph): node sequences for a GBWT
.dist (Distance Index): snarl decomposition plus minimum distances
.zipcodes: per-node distance information used by vg giraffe
.min (Minimizer Index): minimizers used by vg giraffe
.gcsa (Generalized Compressed Suffix Array): substring index used by vg map and vg mpmap
Graphs
.gbz (GBWT + GG): the graph induced by the GBWT
.hg (/.vg) (HashGraph): graph format optimized for speed
.pg (/.vg) (PackedGraph): graph format optimized for space efficiency
.xg: older graph format
.vg: protobuf-based graph format
vg wiki
vg manpage: https://github.com/vgteam/vg/wiki/vg-manpage
snarls paper doi: 10.1089/cmb.2017.0251
short read giraffe paper doi: 10.1126/science.abg8871
long read giraffe paper doi: 10.1101/2025.09.29.678807